Blame assignment for errors made by large vocabulary speech recognizers
Abstract
This paper describes an approach to identifying the reasons that speech recognition errors occur. The algorithm presented requires an accurate word transcript of the utterances being analyzed. It places each error into one of seven categories: 1) an out-of-vocabulary (OOV) word was spoken, 2) search error, 3) homophone substitution, 4) language model overwhelming correct acoustics, 5) transcript/pronunciation problems, 6) confused acoustic models, or 7) miscellaneous/not possible to categorize. Some categories of error can supply training data to automatic corrective training methods that refine acoustic models. Others supply language model and lexicon designers with examples that identify potential improvements. The algorithm is described, and results are presented for the Sphinx-II recognizer [4] on the combined 1992-1995 evaluation test sets of the North American Business (NAB) corpus [1] [2] [3].
Related articles
Why speech recognizers make errors? A robustness view
The performance of large vocabulary speech recognizers often varies depending on the input speech and the quality of the trained models. The particular attributes that cause recognition errors are a research area that has not been well studied. This paper addresses this issue from a robustness perspective using a large amount of field data collected from natural language dialog services. In par...
Japanese spoken term detection using syllable transition network derived from multiple speech recognizers' outputs
This paper proposes a spoken term detection method using a syllable transition network (STN) derived from multiple speech recognizers. An STN is similar to a sub-word based confusion network, which is derived from the output of a speech recognizer. The one we propose is derived from the outputs of multiple speech recognition systems, which is well known to be robust to certain recognition errors and th...
Factorization of Language Constraints in Speech Recognition
Integration of language constraints into a large vocabulary speech recognition system often leads to prohibitive complexity. We propose to factor the constraints into two components. The first is characterized by a covering grammar which is small and easily integrated into existing speech recognizers. The recognized string is then decoded by means of an efficient language post-processor in whic...
Automatic Generation of Pronunciation Dictionaries
In this report we describe a data-driven approach for creating pronunciation dictionaries for a new, unseen target language by voting among phoneme recognizers in nine different languages other than the target language. In this process, recordings of the new language that are transcribed at the word level are decoded by the phoneme recognizers. This results in a hypothesis of nine phonemes per t...
Spoken Term Detection Using Phoneme Transition Network from Multiple Speech Recognizers' Outputs
Spoken Term Detection (STD) that considers the out-of-vocabulary (OOV) problem has generated significant interest in the field of spoken document processing. This study describes STD with false detection control using phoneme transition networks (PTNs) derived from the outputs of multiple speech recognizers. PTNs are similar to subword-based confusion networks (CNs), which are originally derive...